Systems and Methods for Acceleration and Optimization of Web Pages Access by Changing the Order of Resource Loading

ABSTRACT

A method for acceleration of access to a web page. The method comprises receiving a web page responsive to a request by a user; analyzing the received web page for possible acceleration improvements; generating a modified web page of the received web page using at least one of a plurality of acceleration techniques; providing the modified web page to the user, wherein the user experiences an accelerated access to the modified web page resulting from the execution of the at least one of a plurality of acceleration techniques; and storing the modified web page for use responsive to future user requests.

CROSS-REFERENCE TO RELATED APPLICATIONS

This patent application claims priority from U.S. provisional patent application 61/213,959 filed Aug. 3, 2009, and further from U.S. provisional patent application 61/308,951 filed Feb. 28, 2010, both applications assigned to common assignee and hereby incorporated by reference for all that they contain.

TECHNICAL FIELD

The present invention relates generally to accesses to web pages, and more specifically to the acceleration and/or optimization of access speed to such web pages from the user's experience perspective.

BACKGROUND OF THE INVENTION

The traffic over the world-wide-web (WWW) using the Internet is growing rapidly, as are the complexity and size of the information moved from sources of information to users of such information. Bottlenecks in the movement of data from the content suppliers to the users delay the passing of information and decrease the quality of the user's experience. Traffic is still expected to increase faster than the ability to resolve data transfers over the Internet.

Prior art suggests a variety of ways in an attempt to accelerate web page content delivery from a supplier of the content to the users. However, there are various deficiencies in the prior art still waiting to be overcome. It would be advantageous to overcome these limitations, as it would result in a better user experience and reduction of traffic load throughout the WWW. It would be further advantageous that such solutions be applicable with at least all popular web browsers and/or require neither a plug-in nor a specific browser configuration.

SUMMARY OF THE INVENTION

Certain embodiments of the invention include a system for acceleration of access to web pages. The system comprises a network interface enabling communication of one or more user nodes with one or more web servers over a network for accessing web pages stored in the one or more web servers; an acceleration server coupled to the network interface for modifying web pages retrieved from the one or more web servers using at least one acceleration technique, the modified web pages accelerating access to the web page to one or more user nodes; a first cache connected to the acceleration server and the one or more user nodes and operative to cache information associated with requests directed from the one or more user nodes to the acceleration server; a second cache connected to the acceleration server and the one or more web servers and operative to cache information associated with requests directed from the one or more web servers to the acceleration server; and a memory coupled to the acceleration server and containing a plurality of instructions respective of the at least one acceleration technique.

Certain embodiments of the invention further include a method for acceleration of access to a web page. The method comprises receiving a web page responsive to a request by a user; analyzing the received web page for possible acceleration improvements; generating a modified web page of the received web page using at least one of a plurality of acceleration techniques; providing the modified web page to the user, wherein the user experiences an accelerated access to the modified web page resulting from the execution of the at least one of a plurality of acceleration techniques; and storing the modified web page for use responsive to future user requests.

BRIEF DESCRIPTION OF THE DRAWINGS

The subject matter that is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention will be apparent from the following detailed description taken in conjunction with the accompanying drawings.

FIG. 1 is a schematic block diagram of a system for acceleration of web pages access;

FIG. 2 is a schematic diagram of the data flow in a system for acceleration of web pages access;

FIG. 3 is a flowchart of the processing performed for the purpose of generating web pages that accelerate access; and

FIGS. 4A, 4B, 4C and 4D are exemplary scripts of an acceleration technique.

DETAILED DESCRIPTION OF THE INVENTION

The embodiments disclosed by the invention are only examples of the many possible advantageous uses and implementations of the innovative teachings presented herein. In general, statements made in the specification of the present application do not necessarily limit any of the various claimed inventions. Moreover, some statements may apply to some inventive features but not to others. In general, unless otherwise indicated, singular elements may be in plural and vice versa with no loss of generality. In the drawings, like numerals refer to like parts through several views.

In an exemplary embodiment of the invention, a web access acceleration system is placed in the path between the user nodes and the web servers and is responsible for integrating the acceleration mechanisms to the web pages selected for acceleration. The methods for web access acceleration include, for example, parallel loading of a Cascading Style Sheets (CSS) style of a web page, postponement of execution of Javascript code of a web page, maintaining script context when modifying the DOM, causing items to be pre-fetched into a browser's cache, web-site and browser transparent pre-fetching, pre-fetching of resources of subsequent or other pages of a web site, pre-fetching of resources of the same web page, fetching linked pages on demand prior to link access, a path dependent delivery of a web page to a user, automatic generation of combined image containers, caching of dynamic data, intelligent caching of resources, processing links in the background, and postponing of iframes.

FIG. 1 depicts an exemplary and non-limiting schematic block diagram of a system 100 for acceleration of web pages access in accordance with an embodiment of the invention. One or more web page servers 120 are connected to a network 110, each typically providing content as formatted documents using, for example, the hypertext markup language (HTML). The network may be a local area network (LAN), a wide area network (WAN), a metro area network (MAN), the Internet, the world-wide-web (WWW), the like, and any combination thereof. One or more user nodes 130 that are viewers of such web pages content are also connected to the network. A user of a user node 130 typically browses the content using a web browser that is enabled to display the web pages. By using, for example but not by way of limitation, a uniform resource locator (URL), the browser is capable of accessing a desired web page.

Also connected to the network 110 is a web page access accelerator (WPAA) 140. In accordance with the invention, instead of providing web page content directly from a web page server, for example, a web page server 120-1, to a user node, for example, a user node 130-1, traffic is directed through the WPAA 140, when applicable, i.e., when configured for accelerated access. Accordingly, a request for web page content is directed through the WPAA 140 that is equipped with various acceleration mechanisms as further detailed herein below. In one embodiment of the disclosed invention, the web servers 120 are part of a server farm (not shown). In a further embodiment thereof, the WPAA 140 is provided as part of the server farm. In yet another embodiment of the invention, the WPAA 140 is integrated as an integral part of a web page server 120.

FIG. 2 shows an exemplary and non-limiting schematic diagram of the data flow in a system for acceleration of web pages access in an embodiment of the invention. In addition, the details of the structure of the WPAA 140 are also shown. For simplicity reasons and without limiting the scope of the invention, the network interface is removed. However, a network type interface is the typical way for components of the network to communicate with each other.

The WPAA 140 comprises an acceleration server 142 that is connected to the storage 148. The storage 148 typically holds instructions for the execution of methods, described herein below in more detail, that result in accelerating the transfer of web pages content to a user wishing to access such content. Under the control of the acceleration server 142, there is a back-end cache (BEC) 144 connected to the acceleration server 142 and to the one or more web page servers 120-1 through 120-n. The BEC 144 handles requests directed from the acceleration server 142 to the one or more web page servers 120-1 through 120-n. By caching information associated with web servers' requests in the BEC 144, the overall access to web page content is accelerated. Under the control of the acceleration server 142, there is a front-end cache (FEC) 146, connected to the acceleration server 142 and to the one or more user nodes 130-1 through 130-m. The FEC 146 handles requests directed from the one or more user nodes 130-1 through 130-m to the acceleration server 142. By caching information associated with user nodes' requests in the FEC 146, the overall access to web page content is further accelerated.

FIG. 3 shows an exemplary and non-limiting flowchart 300 of the processing performed for the purpose of generating web pages that accelerate access in accordance with an embodiment of the invention. In S310, a page is received, for example by the WPAA 140, in response to a request to receive a web page from, for example, web page server 120. Optionally, in S320, the received web page is stored in a cache, for example, in the BEC 144. In S330, the received web page is analyzed by the acceleration server 142 to determine whether acceleration improvements may be achieved. In S340, it is checked whether improvements were determined to be achievable, and if so execution continues with S350; otherwise execution continues with S360. In S350, the received web page is modified into a modified web page that contains one or more acceleration techniques discussed herein below in more detail. In S360, the modified or the received web page is provided to the user node 130 that requested the web page. Optionally, in S370 the modified web page or the received web page, as may be appropriate, is stored in a cache, for example FEC 146. In S380, it is checked whether additional pages are to be handled, and if so execution continues with S310; otherwise, execution terminates.
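The flow of FIG. 3 may be sketched in Python as follows. This is only an illustrative sketch under stated assumptions: the class name Accelerator, the fetch callable, and the convention that each technique returns a modified page or None when it finds no improvement are hypothetical and do not appear in the specification. The two dictionaries stand in for the BEC 144 and the FEC 146.

```python
from typing import Callable, Dict, List, Optional

# A technique inspects a page and returns a modified page, or None when it
# determines that no improvement is achievable (assumed convention).
Technique = Callable[[str], Optional[str]]


class Accelerator:
    """Hypothetical stand-in for the WPAA 140; all names are illustrative."""

    def __init__(self, fetch: Callable[[str], str], techniques: List[Technique]):
        self.fetch = fetch            # retrieves a page from a web page server
        self.techniques = techniques  # acceleration techniques to try in order
        self.bec: Dict[str, str] = {}  # back-end cache: pages as received
        self.fec: Dict[str, str] = {}  # front-end cache: pages as served

    def handle(self, url: str) -> str:
        # A page stored for future requests is served directly.
        if url in self.fec:
            return self.fec[url]
        # S310/S320: receive the page and optionally keep it in the BEC.
        page = self.bec.get(url)
        if page is None:
            page = self.fetch(url)
            self.bec[url] = page
        # S330-S350: analyze and, where improvements are achievable, modify.
        result = page
        for technique in self.techniques:
            modified = technique(result)
            if modified is not None:
                result = modified
        # S370: store the page that is about to be served (done here just
        # before S360 so that the function can simply return it).
        self.fec[url] = result
        # S360: provide the modified or the received page to the user node.
        return result
```

Each call to handle corresponds to one pass through S310 to S370; calling it repeatedly for further URLs plays the role of the S380 loop.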

While reference is made hereinabove to web pages, it can equally refer to portions of web pages, resources of a web page, and the like, without departing from the scope of the invention. In one embodiment of the invention, the method disclosed above may be performed by the WPAA 140. In other embodiments of the invention, the method can be integrated in a web page server such as web page server 120.

While the description hereinabove was made with respect to one particular system, other systems may be deployed to benefit from the teachings hereinabove and herein below. In one exemplary and non-limiting embodiment of the invention, a system that works as a plug-in/filter/extension to one or more web servers is used. The flow of data through the system is the same as described with respect to the system in FIG. 1; however, it may also utilize knowledge about the data stored on the web site, such as but not limited to, page templates and images. In yet another exemplary and non-limiting embodiment of the invention, the system is a plug-in for a web site integrated development environment (IDE). Using the plug-in, the inventions herein are integrated into the web site during its development. The plug-in therefore enables, at the “compilation” or “build” process of the IDE, changes to the web site coding made by the web site developer according to the inventions. This may take place during development or be automatically implemented during development. In yet another exemplary and non-limiting embodiment, a utility containing, for example and without limitation, a command line component, a user interface (UI) component or any other interface, is run on the designed web site code after it is ready, and/or at one or more points in time during the development thereof, to transform the web site code by employing the inventions herein.

Following are descriptions of acceleration techniques used with respect to, for example, S350, discussed above. However, the use of such techniques may be a part of other embodiments which are specifically included herein.

I. Parallel Loading of a CSS Style of a Web Page

Web pages may include one or more style parts, which allow the separation of the content of the web page from its presentation. The style can be changed and cause the page to look entirely different, despite the fact that it contains the exact same content. The Cascading Style Sheet (CSS) is the mechanism that allows doing so in HTML documents. CSS is a “language” that a browser can interpret to render the display of the page. Attaching a style to an HTML page can be done either by embedding the text of the style inside the HTML document, in one place or divided into several parts embedded in different places of the HTML document, or by putting the text of the style in an external file and putting a directive inside the HTML document to load this file and to use the style definitions in it. Style definitions can be very large (e.g., hundreds of kilobytes), especially if a third-party standard file is used, and both abovementioned ways have the same disadvantage: while the data of the style is being loaded, the parsing and processing of the page is halted, and it is resumed only after the style data has been loaded and processed. Separating style definitions into several parts helps to spread this delay all over the document, but the overall delay remains.

In accordance with certain aspects of the invention, the problem is overcome by forcing the style data to load in parallel to the rest of the data. This is achieved by moving the style data from their original position, embedded into the HTML and/or taken from external file(s), to one or more external files which can be located anywhere. The HTML is then changed to load these new external files in any asynchronous way, as further discussed in “Techniques of bringing Items to the Browser's Cache” herein below. During the loading process, after such a change, the browser of a user node 130 is unaware that the external files contain style data and treats the external files as merely containing raw data. For every one of these external files, after its loading is finished, which is determined differently for every fetch alternative, a new tag is dynamically inserted into the document. The tag is not inserted into the text of the document, but into the logical representation thereof, which is kept by the browser as a document object model (DOM). This tag instructs the browser to apply a new style, which is located in the same file loaded previously, in parallel to other loads, thereby saving on access time. It should be noted that the application of the style remains serial; however, as this file was already loaded and resides in the browser's cache on user node 130, it is read from there and a new request is not sent to fetch it. This way, the loading of the style data is done in parallel to fetching other data items and, though it does occupy some of the bandwidth, it does not delay the loading and processing of the HTML page and its resources, because the parallelism of the operation is increased. Resources of an HTML page include, but are not limited to, stylesheet files, Javascript and other script files, images, video and any other parts of the pages which are not embedded in the HTML.

In one embodiment of the invention, a post-processing tool parses a web page prepared by a developer and transforms it into a parallel loading capable web page based on the principle described above. In another embodiment, the WPAA 140 intercepts the web page and parses it prior to sending it out to the user. The original web page may reside in the BEC 144. The acceleration server 142, based on instructions contained in the storage 148, parses the web page in accordance with the invention described above and provides to the user a parallel loading capable web page, which may also be stored in FEC 146 for future use by other user nodes 130.

II. Postponing Execution of Javascript Code in a Web Page

Typical web browsers are capable of handling scripting languages which allow them to make the web pages more interactive, rather than just contain text and media. One of the more popular scripting languages, supported by practically all known browsers, is Javascript. Javascript code may be embedded into the HTML page in one or more places, and/or loaded from one or more external files. Just like with stylesheets, discussed hereinabove, loading and running Javascript is done serially to the rest of the processing of the web page. Thus, loading and running Javascript code decreases the speed at which the whole web page is loaded.

Most of the Javascript code is used for “behind the scenes” functionality and does not contribute to the way the web page looks. Thus, it would be better to load and run the Javascript after the visible portion of the web page has been downloaded and shown. According to an embodiment of the invention, the HTML page is scanned for script tags, which are then moved to a later place in the HTML page. This location can be at the end of the document, but is not limited thereto. Moving of the tags can be done by actually moving them, or otherwise, by adding a “defer” attribute to the tags, which defers the respective Javascript execution to a later point. When moving the tags, it is important to keep the order between them to ensure proper execution, because a Javascript tag often relies on pieces of code that were defined or executed in one or more of the tags before it.

It should be noted that the Javascript code may be sensitive to its location in the HTML page, thus a straightforward movement of the script tag may not be suitable. In such a case, the original position of the script in the page is marked by either a tag with a unique “id” attribute or in any other way. At a later position in the page, the respective code is “injected” into its original position, i.e., in the DOM.

A non-limiting sequence for postponing the execution of Javascript code would be: while processing the page, for example, by the WPAA 140, marking the script tag location by a marker and moving the script tag content, which can be code or a link to an external file containing the code, to a later position, wrapped by additional code, while maintaining the order of the tags; and when the page is processed by the browser of a user node 130, the original position of the script is processed without a delay and when the browser reaches the new position of the code, it triggers the wrapper previously inserted there. The wrapper writes the original code at its original position in the DOM. This automatically causes the browser to run the code, but in the context of its original position.

In one embodiment of the invention, a post-processing tool parses a web page prepared by a developer for Javascripts and moves them in accordance with the principle described above. In another embodiment, the WPAA 140 intercepts the web page and parses it prior to sending it out to a user node 130. The original web page may reside in the BEC 144. The acceleration server 142, based on instructions contained in the storage 148, parses the web page in accordance with the invention described above and moves Javascripts of the modified web page, which may also be stored in the FEC 146 for future use by other user nodes 130.

While the description above was made with respect to Javascript, it should not be viewed as restricting the scope of the invention which is relevant for any browser scripting language, including but not limited to, VBscript, Silverlight™, and Flash.

III. Maintaining Script Context when Modifying the DOM

Executing scripts may introduce new content into the web page by modifying the respective DOM. Many times this is performed under the assumption that when the script runs, the parsing of the page by the browser has reached only the script's position. Thus, the script may use browser functions like “document.write()” and “document.writeln()” to introduce the new content. Typically, these functions write the new content to the current parsing position of the browser, just after the position of the script tag which has been reached. However, if these functions are executed from another location, they modify the DOM in a different way than originally intended. If they are run after the web page has finished loading, they overwrite the entire web page, as the parsing position these functions use is brought to the beginning of the page once it has finished loading.

According to an embodiment of the invention, the problematic functions are overwritten so that instead of writing the new content into the current parsing position, the new functions write it into, or after if applicable, the original position of the script tag. Inside these new functions, the text passed to the function is converted to a sub-tree of the DOM, a conversion that the original document.write() and other similar functions perform themselves. Then, the new sub-tree is inserted into the DOM at the required location previously marked, for example, by a unique “id” attribute. For some browsers, the original script content is inserted but not executed, so in one embodiment an additional step is required where the browser is instructed to execute the code.

In one embodiment of the invention, a post-processing tool parses a web page prepared by a developer for tagging the scripts in accordance with the principle described above. In another embodiment, the WPAA 140 intercepts the web page and parses it prior to sending it out to a user node 130. The original web page may reside in the BEC 144. The acceleration server 142, based on instructions contained in the storage 148, parses the web page in accordance with the invention described above and tags scripts in the modified web page, which may also be stored in the FEC 146 for future use by other user nodes 130.

IV. Acceleration Technique for Running Scripts Outside of Their Positions in a Web Page File

One of the web page load-time acceleration techniques is to move <script> tags to the end of the document. This way the running of scripts, which can take a long time, does not slow down the rendering of the page. Many scripts are written to be aware of their position in the web page. For example, some scripts create images and Flash components at the same place where they are located. Thus, moving such scripts to another location, thereby stopping them from slowing down the page loading, causes these components to be written to the page in the wrong place.

According to an embodiment of the invention, the script writes everything to the new position and then copies everything that was written in this new location to the original location. Part of what is written can contain additional scripts that can write data of their own; this data should also be copied to its correct position.

Following is an example of the principles of the invention that parses an HTML page and postpones the scripts to the end of the page, while making sure anything the scripts write to the web page is then written to the original position. With this aim, the exemplary script code provided in FIGS. 4A and 4B is added at the end of the <body> tag. In addition, every <script> tag in the page is identified. If the <script> tag is an external script, i.e., it has a “src” attribute, then this attribute is saved to the variable SOURCE and deleted from the element. If the <script> tag already includes an “id” attribute, the “id” attribute is saved to the variable ID. If not, a unique id is generated, the “id” attribute is set to this value, and the value is saved to the variable ID. The SOURCE and ID variables are kept in the memory when and where the page is being processed. Then, the exemplary code shown in FIG. 4C is added at the end of the <body> tag. For an internal script, i.e., a script that has content and does not have a “src” attribute, the script's content is saved to the variable CONTENT and then deleted. If the script tag already includes an “id” attribute, it is saved in the variable ID. If not, a unique id is generated, the “id” attribute is set to the generated value and then saved in the variable ID. Then, the exemplary code shown in FIG. 4D is added at the end of the <body> tag.
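The bookkeeping part of this example (identifying each <script> tag, saving SOURCE or CONTENT, and ensuring an “id”) may be sketched in Python as below. This is a hedged sketch, not the code of FIGS. 4A to 4D, which is not reproduced here. The function name strip_scripts, the "deferred-" id prefix and the record layout are hypothetical, and the regular expressions are deliberately simplistic (for instance, they would also match attributes such as data-src); a real implementation would more likely use an HTML parser.

```python
import itertools
import re
from typing import Dict, List, Optional, Tuple

# Simplistic patterns, adequate only for this illustration.
SCRIPT_RE = re.compile(r"<script\b([^>]*)>(.*?)</script\s*>",
                       re.IGNORECASE | re.DOTALL)
SRC_RE = re.compile(r"""\bsrc\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)
ID_RE = re.compile(r"""\bid\s*=\s*(["'])(.*?)\1""", re.IGNORECASE)

Record = Dict[str, Optional[str]]


def strip_scripts(html: str, prefix: str = "deferred-") -> Tuple[str, List[Record]]:
    """Empty every <script> tag in place and return what was removed.

    Each record holds ID plus either SOURCE (external script) or CONTENT
    (internal script); records keep the original order of the tags.
    """
    records: List[Record] = []
    counter = itertools.count(1)

    def replace(match: "re.Match[str]") -> str:
        attrs, body = match.group(1), match.group(2)
        src_match = SRC_RE.search(attrs)
        id_match = ID_RE.search(attrs)
        if id_match:
            script_id = id_match.group(2)        # reuse the existing id
        else:
            script_id = f"{prefix}{next(counter)}"  # generate a unique id
            attrs = f'{attrs} id="{script_id}"'
        if src_match:
            # External script: save "src" to SOURCE and delete the attribute.
            source, content = src_match.group(2), None
            attrs = SRC_RE.sub("", attrs, count=1)
        else:
            # Internal script: save the body to CONTENT and delete it.
            source, content = None, body
        records.append({"ID": script_id, "SOURCE": source, "CONTENT": content})
        # The emptied tag stays behind as the marker of the original position.
        return f"<script {' '.join(attrs.split())}></script>"

    return SCRIPT_RE.sub(replace, html), records
```

For each returned record, the code of FIG. 4C (when SOURCE is set) or FIG. 4D (when CONTENT is set) would then be added at the end of the <body> tag, in record order so that the original execution order is kept.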

V. Acceleration Technique for Causing Items to be Fetched into a Browser's Cache

By having data pre-stored in a browser's cache, access time to the data item is reduced. Therefore, a need arises, at times, to bring data items to the browser's cache in advance or in anticipation of their future use. This pertains, for example and without limitation, to prefetching/preloading of a subsequent page or resources thereof, fetching resources of the same page earlier or fetching resources in parallel to the loading of the page, and the like. Once the resources are in the cache of the browser, the browser, rather than accessing the data items remotely, can fetch them from its cache without connecting to an external server to read the data items, and hence without being exposed to delays.

A couple of solutions are shown to achieve the desired results. A first approach is used with respect to AJAX, which is a mechanism supported by typical browsers to read from a server asynchronously. The code which initiates an AJAX request receives an event once a page's resource is loaded or, otherwise, in case of an error. Using this mechanism, any resource required in the future or that needs to load in parallel can be fetched. If the purpose is to load the resource in parallel, the resource is used upon the completion event. While appropriate in some cases, this mechanism is limited to fetching resources from the original domain only, that is, resources located in a different domain cannot be fetched. A second approach is to use HTML tags which load external resources. These tags are placed in the text of the HTML, or any referenced external resource, or otherwise inserted dynamically into the DOM using a scripting language. The tags can be, but are not limited to, “link”, “script” and “image”. If anything needs to be done when a resource finishes loading, an event handler, e.g., “onload” or “onerror” handlers, respective of these tags is used. When using a tag to load a resource it was designed to use, e.g., using a SCRIPT tag to load a Javascript file or using a LINK tag to load a stylesheet, the tag must be configured to load only that resource and do nothing else. For a script tag, this can be achieved, among others, using its TYPE attribute; for a link tag, its MEDIA attribute, and others, may be used. Some of these tags stop the processing of the document when used, so they are inferior when used for the required purpose. However, all these tags let the page load a resource from any domain and are therefore a more flexible solution. Instead of creating tags, the same technique may be used by creating script objects. For example, instead of creating an “image” tag, a new Image object can be created. Pointing the Image source to the relevant file achieves the same purpose without actually introducing new tags to the DOM.

In one embodiment of the invention, a post-processing tool parses a web page prepared by a developer for tagging the scripts in accordance with the principle described above. In another embodiment, the WPAA 140 intercepts the web page and parses it prior to sending it out to a user node 130. The original web page may reside in the BEC 144. The acceleration server 142, based on instructions contained in the storage 148, parses the web page in accordance with the invention described above and tags the scripts in a modified web page, which may also be stored in the FEC 146 for future use by other user nodes 130.

VI. Pre-Fetching Resources of the Same Page

The sequence of loading a web page, along with its resources, is inefficient. The protocols do not utilize the network to use the entire available bandwidth at all times. Thus, as the page is parsed and scripts executed, every resource is read from the network only immediately prior to its use. However, in many cases it is possible to bring data much earlier in the page load process. This is specifically useful during periods where the network's bandwidth is not fully utilized.

In accordance with the principles of the invention, the web page's resources are fetched earlier during the load sequence of the web page using one or more of the “Techniques of Bringing Items to the Browser's Cache” discussed herein. This way, the network is better utilized and when the resource is needed, it is already in the cache, thus it is not necessary to read it from the network again.

In one embodiment of the invention, a post-processing tool parses a web page prepared by a developer and inserts the code which loads the page's resources to the cache earlier in the page in accordance with the principle described above. The decision about which resources to pre-fetch and where in the HTML to put the pre-fetch code can be hard coded, configurable, or deduced by the tool. In another embodiment, the WPAA 140 intercepts the web page and parses it prior to sending it out to a user node 130. The original web page may reside in the BEC 144. The acceleration server 142, based on instructions contained in the storage 148, parses the web page in accordance with the invention described above and inserts the code which loads the resources to the cache earlier in the page. The modified web page may also be stored in the FEC 146 for future use by other user nodes 130.
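As one possible illustration of inserting pre-fetch code early in the page, the Python sketch below places a small script directly after the opening <head> tag that points Image objects at the image resources to be pre-fetched, following the script-object alternative described in the preceding section. The function name insert_image_prefetch is hypothetical, the choice of position and of which URLs to pass in is assumed to be made elsewhere, and only a plain "<head>" tag without attributes is recognized.

```python
import json
from typing import Iterable


def insert_image_prefetch(html: str, image_urls: Iterable[str]) -> str:
    """Insert a script after <head> that pre-fetches the given images.

    The page is returned unchanged when there is nothing to pre-fetch or
    when no plain "<head>" tag is found (an assumption of this sketch).
    """
    urls = list(image_urls)
    if not urls:
        return html
    marker = "<head>"
    index = html.lower().find(marker)
    if index == -1:
        return html
    # json.dumps yields a quoted string usable as a Javascript literal for
    # ordinary URLs; a URL containing "</script>" would need extra escaping.
    lines = [f"new Image().src = {json.dumps(url)};" for url in urls]
    snippet = "<script>" + "".join(lines) + "</script>"
    cut = index + len(marker)
    return html[:cut] + snippet + html[cut:]
```

When the browser later reaches the tags that actually use these images, they may be read from its cache rather than from the network, subject to the caching headers sent by the server.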

VII. Automatic Generation of Image Containers

In many web pages, most of the requests to the server are made to bring images. As every request includes a “handshake” with the web server and many times TCP connection time, every such request has an overhead. One way to deal with the problem is to combine two or more images in a single image container, so that a browser can fetch the two or more images using only one request. One known technique to create such a container is typically referred to as CSS sprite. This technique is to combine several images into one “tapestry” image, referred to as a “sprite”, and to bring it in a single request. Then, a CSS is used to define different regions in the combined image and enable the use of each such region as a standalone image. This technique has been used to date in several ways: a) manually combining images into a sprite as part of the design of a web site; or, b) using web sites which allow a user to upload a series of images and download the combined image and the CSS file which the browser will use to separate it back into the original images. Combining images can also be done by using the MHTML format (understood by the Microsoft Internet Explorer browser), the data: URI format (understood by most other browsers, such as Mozilla Firefox and WebKit-based browsers), and others.

Existing solutions automatically combine every few images in a web page into a sprite. This combination is created in the web server, thus the web page is transformed before it ever leaves the server on its way to the end-user. There are two problems with the mechanism: a) for web pages with dynamic data, often only some of the images are common to all the instances of the web page and other images change. For example, the home page of Facebook contains different images for different users, but the images that create the background are always the same. Thus, images cannot be blindly combined. When designing a web site, sprites can be designed to separate between the different kinds of images (as the designer knows the structure of the web site). A system which is placed outside the web server does not have this knowledge; and b) there is a conflict between the need to put as many images as possible in the sprite (to reduce latency) and the fact that no image will be displayed until the entire sprite is brought from the server, and thus there is a need to put fewer images in the sprite.

In accordance with the principles of the invention, the solution is a mechanism that decides which images should be placed in every image container. The factors include, but are not limited to, which images are common to all instances of a web page and which images are visible on a common display when the web page is loaded. The images that are common to every instance of a page may be hard coded, supplied through a configuration notification, or otherwise learned by the system over time by analyzing the web pages and/or images passing through it. In the case of images visible on a display, the size of the display can be determined automatically by analyzing the incoming headers, heuristically by assuming common display sizes, or both. Once the display size has been determined, one or more containers can be generated. For example, one container may be generated for the visible items and one container for the items outside the immediate or initial display boundaries, i.e., those display items that the user needs to scroll to. Alternatively, a container may be generated for the visible images and no container at all for the ones outside the visible area. Other criteria may be used, for example, all the images which create the background may be made part of one container and all the other images may be divided between other containers or left alone (even if other images are common to all instances of the page and are in the visible area). Another embodiment may use a criterion of placing the images common to all users in one container and then placing the images which change among requests from different users into another container.
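As a non-limiting sketch of one such decision mechanism, the following assumes that the commonality of each image and its vertical position on the rendered page are already known, and that images which change between instances of the page are left out of any container. The names `PageImage`, `assign_containers` and the container labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PageImage:
    url: str
    top: int         # Y offset of the image on the rendered page, in pixels
    is_common: bool  # True if the image appears in every observed instance of the page


def assign_containers(images: list[PageImage], viewport_height: int) -> dict[str, list[str]]:
    """Split images into a visible container, a below-the-fold container,
    and a list of images that are left as standalone requests."""
    containers: dict[str, list[str]] = {"visible": [], "below_fold": [], "standalone": []}
    for img in images:
        if not img.is_common:
            # Images that differ between page instances cannot be shared
            # in a cached container, so they are not combined.
            containers["standalone"].append(img.url)
        elif img.top < viewport_height:
            containers["visible"].append(img.url)
        else:
            containers["below_fold"].append(img.url)
    return containers
```

Other embodiments described above, such as grouping all background images together, would correspond to a different predicate in the same loop.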

In one embodiment of the invention, a post-processing tool parses a web page prepared by a developer for creating the sprites in accordance with the principles described above. In another embodiment, the WPAA 140 intercepts the web page and parses it prior to sending it out to a user node 130. The original web page may reside in the BEC 144. The server 142, based on instructions contained in the storage 148, parses the web page in accordance with the invention described above and generates the sprites for the modified web page. The modified web page may also be stored in the FEC 146 for future use by other user nodes 130.

VIII. Postponing of iframes

iframes are pieces of an HTML page which are themselves other HTML pages. Every iframe has its own address, so every iframe requires one or more requests. iframes are supposed to load and run in parallel to the parent document, but in practice this is not always so, and many times they introduce a delay to the loading of the page.

In most cases, the content inside the iframe is not the primary content of the web site, and many times it is not even in the area that is visible when the site is loaded. Thus, the iframe tags in the HTML of the page can be replaced by placeholders, for example and without limitation, tags with a unique id, and code can be inserted further down in the HTML which puts an iframe tag into its original placeholder. Alternatively, the placeholder can be an empty iframe tag, in which case the code just directs the tag to the address the original iframe pointed to.
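A minimal sketch of the empty-iframe variant is given below for illustration only. It assumes that the iframe addresses are written as double-quoted `src` attributes and uses a regular expression rather than a full HTML parser; the function name `postpone_iframes` and the `data-deferred-id` attribute are hypothetical and serve as the unique identification of each placeholder.

```python
import json
import re

# Matches an opening iframe tag up to and including a double-quoted src attribute.
IFRAME_SRC = re.compile(r'(<iframe\b[^>]*?)\ssrc\s*=\s*"([^"]*)"', re.IGNORECASE)


def postpone_iframes(html: str) -> str:
    """Strip iframe addresses and append a script that restores them later."""
    sources: dict[str, str] = {}

    def strip_src(match: re.Match) -> str:
        placeholder_id = f"deferred-iframe-{len(sources)}"
        sources[placeholder_id] = match.group(2)
        # Keep the tag as an empty iframe that carries only a unique id.
        return f'{match.group(1)} data-deferred-id="{placeholder_id}"'

    rewritten = IFRAME_SRC.sub(strip_src, html)
    if not sources:
        return html

    # Escape "</" so the JSON payload cannot close the script element early.
    payload = json.dumps(sources).replace("</", "<\\/")
    loader = (
        "<script>"
        "var deferred = " + payload + ";"
        "for (var id in deferred) {"
        "  var frame = document.querySelector('iframe[data-deferred-id=\"' + id + '\"]');"
        "  if (frame) { frame.src = deferred[id]; }"
        "}"
        "</script>"
    )
    # Place the loader at the end of the body so the rest of the page is parsed first.
    body_end = rewritten.lower().rfind("</body>")
    if body_end == -1:
        return rewritten + loader
    return rewritten[:body_end] + loader + rewritten[body_end:]
```

Because the loader is placed at the end of the document, the browser does not begin fetching the iframe content until the rest of the page has been parsed.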

In one embodiment of the invention, a post-processing tool parses a web page prepared by a developer for tagging the iframes in accordance with the principles described above. In another embodiment, the WPAA 140 intercepts the web page and parses it prior to sending it out to a user node 130. The original web page may reside in the BEC 144. The acceleration server 142, based on instructions contained in the storage 148, parses the web page in accordance with the invention described above and tags the iframes for the modified web page. The modified web page may also be stored in the FEC 146 for future use by other user nodes 130.

IX. Splitting Combined Web Page Resources

Many web page load time optimization techniques include combining a web page's resources such as, but not limited to, images, style sheets, JavaScript files, and others. The aim is to reduce the impact of latency and server side request processing time. However, the farther a client is from a server, the worse the bandwidth between them is. Therefore, though the impact of latency and request processing time is reduced, the entire data may be transferred more slowly than it would otherwise be sent in its non-combined state.

According to an embodiment of the invention, the combined resource is split into several, but not many, containers that are downloaded at the same time. The number of containers is within a predefined range (upper limit and lower limit) that is set to overcome a connection-per-domain limit of a user's browser. In some cases, the containers should be downloaded from different domains/sub-domains to overcome the browser's connection-per-domain limit. For example, combining two hundred small images into four CSS sprites would be more efficient than either leaving the two hundred images as is or combining all of them into one big sprite. It should be noted that this embodiment may be performed by a post-processing tool or the WPAA 140.
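The splitting may be sketched, without limitation, as follows. The sketch assumes that the size of each resource is known and balances the containers with a greedy heuristic; the balancing strategy, the function name `split_into_containers`, and the default limits of 2 and 6 are assumptions made for illustration and are not taken from the disclosure.

```python
def split_into_containers(
    sizes: dict[str, int], count: int, lower: int = 2, upper: int = 6
) -> list[list[str]]:
    """Distribute resources over a bounded number of containers of similar size."""
    # Clamp the requested number of containers to the predefined range.
    count = max(lower, min(upper, count))
    containers: list[list[str]] = [[] for _ in range(count)]
    totals = [0] * count
    # Greedy balancing: place the largest remaining resource into the
    # container that currently holds the fewest bytes.
    for name, size in sorted(sizes.items(), key=lambda item: item[1], reverse=True):
        lightest = totals.index(min(totals))
        containers[lightest].append(name)
        totals[lightest] += size
    return containers
```

With two hundred images of equal size and four containers, each container receives fifty images, corresponding to the four-sprite example above.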

X. Viewport Prioritization

Once a web page is loaded, only a part of it is immediately visible. The visible part is called the “viewport”, which basically is everything that is viewable “above the fold”. To increase the speed at which a web page is loaded, as far as the user experience is concerned, this part should be fully loaded before the invisible part starts loading. Many times, the order of a page's resources, e.g., iframes, JavaScript files, CSS files, images, and so on, as they are defined in the HTML web page, does not correspond to their actual location on the screen. The browser requests the resources in the order they are placed in the HTML file, due to the sequential nature of the parsing of the HTML file. However, in many cases, this causes resources appearing lower on the screen, and typically resources that do not appear in the viewport at all, to be fetched before they are needed. This causes some resources in the viewport to be fetched later than would be beneficial to the user, reduces utilization of bandwidth, and unnecessarily uses connections whose number is limited by the browser.

According to an embodiment of the invention, the viewport prioritization solution consists of two parts. The first part is a script that runs on the web page and collects information about the location of every element of the web page, including elements that are defined inside iframes. This script reports the collected data to the server, in either raw or processed form. The second part is a component which analyzes the collected data. For the combined resources, the order of the resources in the combined files is defined according to their position on the screen, sorted by their position with respect to the Y-axis. For the regular resources, a script is added to the beginning of the web page which asynchronously preloads the resources according to their position on the screen. Thus, when the browser tries to fetch the resource during rendering, the resource is already in the cache. Also, all the resources which are not in the viewport are postponed until all the resources in the viewport are loaded. The viewport is typically determined separately for every user during the rendering of the page, or defined heuristically for all clients or groups of clients. It should be noted that this embodiment may be performed by the post-processing tool or the WPAA 140.
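The analysis performed by the second part may be illustrated by the following non-limiting sketch, which assumes that the collected data has already been reduced to a mapping from each resource address to the Y offset of the element that uses it. The function name `order_for_loading` is hypothetical.

```python
def order_for_loading(
    positions: dict[str, int], viewport_height: int
) -> tuple[list[str], list[str]]:
    """Sort resources by Y position and separate those outside the viewport."""
    by_y = sorted(positions, key=lambda url: positions[url])
    # Resources whose top edge falls inside the viewport are preloaded first.
    in_viewport = [url for url in by_y if positions[url] < viewport_height]
    # Everything else is postponed until the viewport resources are loaded.
    postponed = [url for url in by_y if positions[url] >= viewport_height]
    return in_viewport, postponed
```

The first list gives the order in which a preloading script would request the regular resources, or the order in which resources would be written into a combined file; the second list holds the resources to be postponed.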

XI. Background Image Management for Web Pages

On a typical web page, some of the images are actual images defined by <img> tags on the page and some are background images defined in various styles. All the images from the <img> tags are fetched when the page is loaded, but not all the images defined in the styles are fetched. Only when an element uses the style is the image fetched. When statically analyzing the web page (for pre-fetch, image combining or any other purpose), it is difficult to understand which images are actually part of the page and which images are just defined in the styles but are not actually used by the page.

The solution is based on a client side script (JavaScript, for example) which scans, according to predefined criteria, some or all elements in the DOM. This script reads the effective style of every such element and checks whether this style contains a background image and, if it does, which image it is. Then, the script sends the gathered information to the server, where it can be used for optimization techniques such as image combining, sorting image loading according to the visual position on the page, and pre-fetching. It should be noted that this embodiment may be performed by a post-processing tool or the WPAA 140.
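On the server side, the use of the gathered information may be sketched as follows. The sketch assumes, for illustration only, that the client side script reports the effective `background-image` value of each scanned element as a string; the name `used_background_images` is hypothetical.

```python
import re

# Captures the address inside each CSS url(...) token, with or without quotes.
URL_IN_STYLE = re.compile(r"""url\(\s*['"]?([^'")]+?)['"]?\s*\)""")


def used_background_images(reported_styles: list[str]) -> set[str]:
    """Collect the background image addresses actually used by page elements."""
    used: set[str] = set()
    for value in reported_styles:
        # A value such as "none" contributes nothing; a value may also
        # list several images separated by commas.
        used.update(URL_IN_STYLE.findall(value))
    return used
```

Subtracting this set from the set of images defined in the style sheets yields the images that are defined but not used, which can then be excluded from combining or pre-fetching.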

XII. Progressive Loading of Combined Resources of a Web Page

When combining resources trivially, every one of these resources is available only once the entire combined resource is loaded. This postpones the rendering of the first resource in the file until later resources are loaded.

According to an embodiment of the invention, the combined file is loaded progressively. For example, when loading a resource using AJAX, browsers read the resource chunk by chunk, returning control to the AJAX callback function after each chunk. Thus, the following process can be used in the AJAX callback function to achieve progressive loading:

1) Checking that the function is called after a chunk and not because of an error;

2) Adding the new chunk to the existing chunks' buffer;

3) Parsing the chunks' buffer;

4) If new resources were found in the updated buffer then:

4a) For every one of the new resources:

-   Finding all elements which use the new resource, for example, image tags which point to the resource, now a part of the combined file; and

-   Replacing the address the elements point to with that of the new resource, which is fully loaded and is now in the cache.
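Steps 2) to 4) above may be sketched, without limitation, as an incremental parser. The disclosure does not define the layout of the combined file, so the sketch assumes a hypothetical format in which each resource is stored as a header line of the form `name length` followed by exactly `length` characters of data. The class name `ProgressiveParser` is likewise hypothetical.

```python
class ProgressiveParser:
    """Extracts complete resources from a combined file as chunks arrive."""

    def __init__(self) -> None:
        self._buffer = ""
        self.resources: dict[str, str] = {}

    def feed(self, chunk: str) -> list[str]:
        """Add a chunk to the buffer and return names of newly completed resources."""
        self._buffer += chunk
        completed: list[str] = []
        while True:
            header_end = self._buffer.find("\n")
            if header_end == -1:
                break  # the header line has not fully arrived yet
            name, length_text = self._buffer[:header_end].rsplit(" ", 1)
            length = int(length_text)
            body_start = header_end + 1
            if len(self._buffer) < body_start + length:
                break  # the resource data has not fully arrived yet
            self.resources[name] = self._buffer[body_start:body_start + length]
            # Drop the consumed record so the next header starts the buffer.
            self._buffer = self._buffer[body_start + length:]
            completed.append(name)
        return completed
```

Each name returned by `feed` corresponds to a resource for which step 4a) can be carried out immediately, without waiting for the remainder of the combined file.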

One example of using such a method is for the data: URI mechanism in modern browsers. Using it naively causes the browser to wait until the entire combined file is loaded. When applying the disclosed method, every time a resource finishes loading, it can be placed by the script and used by any element.

An addition to the process is to progressively load background images. Background images are defined in styles rather than by element tags, and thus cannot be handled in the manner described above. However, the following process can be applied:

1) Combining the style sheet definitions that contain background URLs into, for example, one combined file, which also contains the data of the images. It should be noted that in some cases several combined files may be created;

2) Reading the combined file using AJAX;

3) Every time the control returns to the AJAX callback function the following is performed:

3a) Adding the new chunk to the read data array;

3b) Parsing the read data array to identify if any new classes were added; and

3c) For every new class added, preferably in full, the new class is applied to the web page.

It should be noted that this embodiment may be performed by a post-processing tool or the WPAA 140.
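Steps 3a) to 3c) above may be illustrated by the following non-limiting sketch, which assumes that the combined file consists of flat style rules, each terminated by a closing brace, with the image data embedded in the rule. Nested constructs such as media queries are not handled. The function name `extract_complete_rules` is hypothetical.

```python
def extract_complete_rules(buffer: str) -> tuple[list[str], str]:
    """Return the style rules received in full and the unparsed remainder."""
    rules: list[str] = []
    while True:
        end = buffer.find("}")
        if end == -1:
            break  # the next rule has not been received in full yet
        rules.append(buffer[:end + 1].strip())
        buffer = buffer[end + 1:]
    return rules, buffer
```

After every chunk, the caller appends the chunk to the remainder, calls the function again, and applies each returned rule to the web page, so that a background image becomes visible as soon as its own rule is complete.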

The principles of the invention can be implemented as hardware, firmware, software or any combination thereof. Moreover, the software is preferably implemented as an application program tangibly embodied on a program storage unit, a non-transitory computer readable medium, or a non-transitory machine-readable storage medium that can be in a form of a digital circuit, an analog circuit, a magnetic medium, or combination thereof. The application program may be uploaded to, and executed by, a machine comprising any suitable architecture. Preferably, the machine is implemented on a computer platform having hardware such as one or more central processing units (“CPUs”), a memory, and input/output interfaces. The computer platform may also include an operating system and microinstruction code. The various processes and functions described herein may be either part of the microinstruction code or part of the application program, or any combination thereof, which may be executed by a CPU, whether or not such computer or processor is explicitly shown. In addition, various other peripheral units may be connected to the computer platform such as an additional data storage unit and a printing unit.

The foregoing detailed description has set forth a few of the many forms that the invention can take. It is intended that the foregoing detailed description be understood as an illustration of selected forms that the invention can take and not as a limitation to the definition of the invention. It is only the claims, including all equivalents, that are intended to define the scope of this invention.

1. A system for acceleration of access to web pages, comprising: a network interface enabling communication of one or more user nodes with one or more web servers over a network for accessing web pages stored in the one or more web servers; an acceleration server coupled to the network interface for modifying web pages retrieved from the one or more web servers using at least one acceleration technique, the modified web pages accelerating access to the web page to one or more user nodes; a first cache connected to the acceleration server and the one or more user nodes and operative to cache information associated with requests directed from the one or more user nodes to the acceleration server; a second cache connected to the acceleration server and the one or more web servers and operative to cache information associated with requests directed from the one or more web servers to the acceleration server; and a memory coupled to the acceleration server and containing a plurality of instructions respective of the at least one acceleration technique.
2. The system of claim 1, wherein the at least one acceleration technique comprises: forcing style data of a web page of the web pages to load in parallel to the rest of the data of the web page by moving the style data from the web page into at least an external file to the web page; and loading the at least an external file asynchronously to loading of the web page.
3. The system of claim 1, wherein the at least one acceleration technique comprises: scanning a web page of the web pages for embedded scripts; adding a script tag at each embedded script location; and moving each embedded script to an external file, wherein upon processing of the web page on a browser the original position of the script is processed without delay and when reaching execution of the script, the script is being written at its original position in a document object model (DOM) of the web page.
4. The system of claim 3, further comprising: identifying functions that when executed from a new location in the web page cause erroneous behavior; replacing the identified functions with new functions; converting text passed to the identified functions into a sub-tree of the DOM; and inserting the sub-tree into the DOM at a marked location.
5. The system of claim 1, wherein the at least one acceleration technique comprises: causing items of a web page of the accessed web pages to be loaded into a browser's cache by performing at least one of: fetching the items into the browser's cache in parallel to the loading of the web page; fetching, into the browser's cache, the items that are in a same domain of the web page; and asynchronously loading the items that are not in the same domain of the web page into the browser's cache by using tags of loading external resources.
6. The system of claim 5, wherein the at least one acceleration technique further comprises: causing a post-processing parsing tool to insert into the web page a pre-fetch code that when executed by the browser on a user node causes the pre-fetch of items in the web page into the browser's cache.
7. The system of claim 1, wherein the at least one acceleration technique comprises: placing images of a web page in a plurality of image containers depending on image parameters; and determining a priority between the plurality of image containers such that an image container having a higher priority is loaded prior to an image container having a lower priority.
8. The system of claim 7, wherein the image parameters are at least one of: image size, image visibility within a display area, and image commonality.
9. The system of claim 1, wherein the at least one acceleration technique comprises: modifying a web page by replacing at least each iframe tag in a web page with a placeholder; and storing the modified web page in the second cache.
10. The system of claim 9, wherein the placeholder is a tag having a unique identification.
11. The system of claim 1, wherein the at least one acceleration technique comprises: splitting a combined resource of the web page into a plurality of containers, wherein the number of containers is above a predefined lower limit and below a predefined upper limit.
12. The system of claim 11, wherein the predefined lower limit and the predefined upper limit are determined to overcome a connection-per-domain limit of a user's browser.
13. The system of claim 1, wherein the at least one acceleration technique comprises: collecting elements of a web page including elements defined inside at least iframe tags; sorting of resources to be displayed on a user node from the collected elements with respect to the Y axis of the display and postponing resources that are not in the viewport of the display; modifying the web page by adding a script to the web page to asynchronously load elements which are in the viewport; and storing the modified web page in the second cache.
14. The system of claim 1, wherein the at least one acceleration technique comprises: determining if a page style of a web page contains a background image by analyzing the document object model (DOM) of the web page using a client side script executed on one of the user nodes; and informing the acceleration server about the background image for the purpose of use of such information for future optimizations of the web page.
15. The system of claim 1, wherein the at least one acceleration technique further comprises: parsing the web page to detect embedded scripts; and replacing the embedded scripts at the end of the web page, thereby postponing execution of the embedded scripts after the web page has been downloaded, wherein upon execution of an embedded script any data written by the embedded script is written at the original position of the embedded script.
16. A method for acceleration of access to a web page, comprising: receiving a web page responsive to a request by a user; analyzing the received web page for possible acceleration improvements; generating a modified web page of the received web page using at least one of a plurality of acceleration techniques; providing the modified web page to the user, wherein the user experiences an accelerated access to the modified web page resulting from the execution of the at least one of a plurality of acceleration techniques; and storing the modified web page for use responsive to future user requests.
17. The method of claim 16, wherein the at least one of the plurality of acceleration techniques comprises: forcing a style data of the web page to load in parallel to the rest of the data of the web page by moving the style data from the web page into at least an external file to the web page; and loading of the at least an external file asynchronously to loading of the web page.
18. The method of claim 16, wherein the at least one of the plurality of acceleration techniques comprises: scanning the web page for embedded scripts; adding a script tag at each embedded script location; and moving each embedded script to an external file, wherein upon processing of the web page on a browser the original position of the script is processed without delay and when reaching execution of the script, the script is being written at its original position in a document object model (DOM) of the web page.
19. The method of claim 18, further comprising: identifying functions that when executed from a new location in the web page cause erroneous behavior; replacing the identified functions with new functions; converting text passed to the identified functions into a sub-tree of the DOM; and inserting the sub-tree into the DOM at a marked location.
20. The method of claim 16, wherein the at least one of the plurality of acceleration techniques comprises: causing items of the web page to be loaded into a browser's cache by performing at least one of: fetching the items into the browser's cache in parallel to the loading of the web page; fetching, into the browser's cache, the items that are in a same domain of the web page; and asynchronously loading the items that are not in the same domain of the web page into the browser's cache by using tags of loading external resources.
21. The method of claim 20, further comprising: inserting into the web page a pre-fetch code that when executed by the browser on a user node causes the pre-fetch of items in the web page into the browser's cache.
22. The method of claim 16, wherein the at least one of the plurality of acceleration techniques comprises: placing images of a web page in a plurality of image containers depending on image parameters; determining a priority between the plurality of image containers; and loading an image container having a higher priority prior to loading an image container having a lower priority.
23. The method of claim 22, wherein the image parameters are at least one of: image size, image visibility within a display area, and image commonality.
24. The method of claim 16, wherein the at least one of the plurality of acceleration techniques comprises: modifying the web page by replacing at least each iframe tag in the web page with a placeholder; and storing the modified web page in a cache memory.
25. The method of claim 24, wherein the placeholder is a tag having a unique identification.
26. The method of claim 16, wherein the at least one of the plurality of acceleration techniques comprises: splitting a combined resource of the web page into a plurality of containers, wherein the number of containers is above a predefined lower limit and below a predefined upper limit.
27. The method of claim 26, wherein the predefined lower limit and the predefined upper limit are determined to overcome a connection-per-domain limit of a user's browser.
28. The method of claim 16, wherein the at least one of the plurality of acceleration techniques comprises: collecting elements of the web page including elements defined inside at least iframe tags; sorting of resources to be displayed on a user's screen from the collected elements with respect to the Y axis of the display and postponing resources that are not in the viewport of the display; modifying the web page by adding a script to the web page to asynchronously load elements which are in the viewport; and storing the modified web page in a cache memory.
29. The method of claim 16, wherein the at least one of the plurality of acceleration techniques comprises: parsing the web page to detect embedded scripts; and replacing the embedded scripts at the end of the web page, thereby postponing execution of the embedded scripts after the web page has been downloaded, wherein upon execution of an embedded script any data written by the script is written at the original position of the embedded script.
30. A non-transitory computer readable medium having stored thereon instructions for causing one or more processing units to execute the method according to claim 16.